Introduction
Sorghum [Sorghum bicolor (L.) Moench] is the world’s fifth most important cereal crop, in terms of both production and area planted. It is an increasingly relevant grain crop due to its resilience to drier and hotter climates. In the United States, sorghum is typically grown in dryland areas from South Dakota to South Texas. Texas accounts for ~1.8 million of the United States’ ~5.8 million acres of sorghum production.
The idea for this visualization project is to use the Texas A&M Variety Testing and USDA NASS data repositories on grain sorghum production to visualize yield and production trends for the state of Texas.
Datasets
For this project we plan to use two main datasets. The first is the TAES dataset from Texas A&M AgriLife, which contains data from the 2021 Texas Sorghum Variety Trials. The second is the dataset from the USDA National Agricultural Statistics Service (NASS) website. This dataset is helpful because we can query it for different time periods, counties, etc.
TAES Dataset: This dataset contains data from multi-environment trials of commercially released sorghum hybrids grown from 1970-2021 across different counties/cities around Texas.
Characteristics/Attributes: This dataset is extensive and has 50+ columns describing hybrid sorghum performance and the management practices used to grow each trial. Some important variables include…
- Year - Year survey data point was obtained
- County - Texas County Name
- Hybrid - Specific sorghum genotype
- Irrigation - Irrigation level/amount
- Brand - Company brand of sorghum plant/seed
- Herbicide - Herbicides applied to control weeds in the crop
- Phosphorus - amount of phosphorus applied in fertilizer
- Nitrogen - amount of nitrogen applied in fertilizer
- Potassium - amount of potassium applied in fertilizer
Other, more ancillary variables…
- Avg % Bird Damage - average percentage of crop lost due to birds
- Avg pH - average pH (acidity) measured in the field
At first glance, these columns appear likely to hold interesting insights. Which variables our team deems important may change as the project progresses.
USDA Dataset: This database is more queryable and contains additional information on not only sorghum but other crops as well. Characteristics/Attributes:
- Year - Year survey data point was obtained.
- County - Texas county name
- Commodity - Type of crop
- Value - Measure of crop performance (yield, area harvested, etc.)
- CV(%) - The coefficient of variation
As in the former dataset, important variables may change as we explore the data further.
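Because both datasets carry Year and County columns, one way to compare them is to join on that pair. The following is a minimal sketch of such a join, not our final pipeline. The rows are invented placeholders rather than real records, the `Yield` column name for the trial data is an assumption (the trial yield column is not listed above), and the helper names `county_key` and `join_on_year_county` are hypothetical.

```python
from collections import defaultdict

# Placeholder rows for illustration only; these are NOT real USDA or TAES values.
usda_rows = [
    {"Year": 2020, "County": "HALE", "Commodity": "SORGHUM", "Value": 60.0},
    {"Year": 2020, "County": "HALE", "Commodity": "CORN", "Value": 150.0},
    {"Year": 2021, "County": "Hale ", "Commodity": "SORGHUM", "Value": 65.0},
]
trial_rows = [
    # "Yield" is an assumed column name for the trial result.
    {"Year": 2021, "County": "Hale", "Hybrid": "H1", "Yield": 70.0},
    {"Year": 2021, "County": "Hale", "Hybrid": "H2", "Yield": 80.0},
    {"Year": 2019, "County": "Nueces", "Hybrid": "H1", "Yield": 55.0},
]


def county_key(row):
    """Normalize (Year, County) so spelling and case differences do not block the join."""
    return (int(row["Year"]), row["County"].strip().upper())


def join_on_year_county(usda, trials, commodity="SORGHUM"):
    """Pair the mean trial yield with the USDA value for each shared (Year, County)."""
    # Keep only the requested commodity from the USDA rows.
    county_value = {
        county_key(r): r["Value"]
        for r in usda
        if r["Commodity"].upper() == commodity
    }
    # Collect all hybrid yields observed in each (Year, County) trial.
    trial_yields = defaultdict(list)
    for r in trials:
        trial_yields[county_key(r)].append(r["Yield"])

    joined = []
    for key in sorted(trial_yields):
        if key in county_value:  # inner join: skip keys missing from USDA
            ys = trial_yields[key]
            joined.append({
                "Year": key[0],
                "County": key[1],
                "TrialMeanYield": sum(ys) / len(ys),
                "CountyValue": county_value[key],
            })
    return joined


joined = join_on_year_county(usda_rows, trial_rows)
```

An inner join is used here so that only county-years present in both sources are compared. Whether the USDA Value and the trial yield share units would need to be checked before plotting them together.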
Story:
- What are the most important factors that contribute to sorghum crop yield in the counties of Texas?
- How has sorghum production and performance changed over time?
- What variables have caused this change?
Audience: Farmers, Students, Agricultural institutions.
Attributes: See dataset descriptions above.
Metrics: Correlations (X vs Y), linear regressions, error calculations, future crop yield/outputs.
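The metrics listed above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not our final analysis: the yearly yield values are invented placeholders, the function names are hypothetical, and extrapolating a straight-line trend is only the simplest possible way to project future yield.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return slope, intercept


def rmse(x, y, slope, intercept):
    """Root-mean-square error of the fitted line on the observed points."""
    squared = [(b - (slope * a + intercept)) ** 2 for a, b in zip(x, y)]
    return math.sqrt(mean(squared))


# Placeholder values for illustration only; NOT real sorghum yields.
years = [2017, 2018, 2019, 2020, 2021]
yields = [50.0, 54.0, 53.0, 58.0, 60.0]

r = pearson_r(years, yields)
slope, intercept = fit_line(years, yields)
error = rmse(years, yields, slope, intercept)
forecast_2022 = slope * 2022 + intercept  # naive linear extrapolation
```

The same functions apply to any pair of numeric columns, for example nitrogen applied versus yield, as long as rows with missing values are dropped first.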
Inspiration
First, there already exist at least two examples of mapping county-level sorghum production: the USDA National Agricultural Statistics Service’s published maps, and Zachary Stansell’s tool for mapping USDA crop data, which uses the USDA NASS APIs.


Further, we are motivated to go beyond a static map and create a dynamic choropleth map like Zachary Labe’s depiction of Arctic sea ice volume/thickness. That depiction is a GIF, but an interactive approach could let the viewer slide through the years and watch the geographic heat map of Texas’ grain sorghum trends change. Lastly, the regions of Texas could provide the categories for a stream graph depicting a variable’s distribution over the years.

Sketch
